Communications Chemistry
Springer Science and Business Media LLC
Preprints posted in the last 30 days, ranked by how well they match Communications Chemistry's content profile, based on 39 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit.
Kuyler, G. C.; Murray, R. J.; Khwaja, F. N.; Gunner, J.; Klumperman, B.; Poyner, D.; Ayub, H.; Wheatley, M.
Detergent-free extraction of membrane proteins using polymers directly into nanodiscs from the cell membrane has been used widely in recent years. Since the first use of poly(styrene-co-maleic acid) (SMA), numerous related polymers have been developed that differ in chemical architecture and nanodisc characteristics, each capable of influencing the structural and functional properties of the encapsulated membrane protein and its surrounding lipids. Identifying an optimal solubilising polymer, therefore, requires consideration not only of extraction efficiency but also compatibility with downstream applications and analyses. Polymer series in which a single parameter is systematically varied provide a valuable, nuanced tool for optimising nanodisc utility in downstream applications. This study utilises a chemically defined series of poly(styrene-co-maleic acid-co-(N-benzyl)maleimide) (BzAM) terpolymers that exhibit a stepwise, systematic increase in hydrophobicity. Using the human calcitonin gene-related peptide (CGRP) receptor as an exemplar class B1 G-protein-coupled receptor (GPCR), the ability of each BzAM terpolymer to solubilise the receptor from mammalian cell membranes was assessed. All members of the series successfully solubilised CGRP receptor, with solubilisation efficiency correlating positively with increasing hydrophobicity. Importantly, the receptor retained its characteristic high-affinity ligand-binding capability when encapsulated within the BzAM nanodisc, demonstrating that functional integrity is preserved following BzAM-mediated extraction and purification. These findings establish the BzAM terpolymer series as a systematic, tuneable, well-defined tool for the detergent-free solubilisation and functional investigation of GPCRs, and other membrane proteins, in near-native lipid environments. 
HIGHLIGHTS
- Stepwise-tuned poly(styrene-co-maleic acid-co-(N-benzyl)maleimide) (BzAM) terpolymers provide a chemically defined, hydrophobicity-controlled platform for detergent-free membrane protein extraction.
- All BzAM variants effectively solubilise the human calcitonin gene-related peptide (CGRP) receptor, with extraction efficiency increasing in line with terpolymer hydrophobicity.
- CGRP receptor maintains high-affinity ligand binding in BzAM nanodiscs, demonstrating preservation of ligand-binding function after solubilisation.
- The BzAM series provides a novel platform for studying G-protein-coupled receptors and other membrane proteins in near-native lipid environments, with the potential to deliver mechanistic insights and support future drug-discovery efforts.
GRAPHICAL ABSTRACT: Figure 1
Chakraborty, A.; Khan, F.; Sharma, S.; Ameta, S.
The internal dynamics of liquid-liquid phase-separated systems are governed primarily by polymer packing, excluded-volume effects, and interactions between polymers and encapsulated macromolecules. Although one immediate effect of such a constrained microenvironment is diffusion limitation, it remains unclear whether encapsulated macromolecules can also exhibit phase composition-specific functional behaviour that is not observable in a well-mixed aqueous environment. In this regard, different phases in a phase-separated environment can be accessed via a phase diagram that demarcates the region between two-phase (droplets) and one-phase (polymer-rich, no droplets) regimes. While the two-phase region is heterogeneous, most previous work on encapsulating functional macromolecules in phase-separated droplets uses a single point from the phase diagram. This leaves a clear gap in understanding how function scales across this landscape of droplets and in identifying regions that are advantageous for the encapsulated macromolecule and its function. Here, using the Spinach light-up RNA aptamer, we show that RNA function does not scale uniformly across the phase diagram. We show that RNA can exhibit phase composition-specific functional behaviour due to constraints imposed by the internal microenvironment of phase-separated droplets. Furthermore, using variants of the Spinach aptamer, we show that the differences in fluorescence activity among the variants change with the phase-separation regime across the phase map, suggesting that some regions of the phase diagram can confer a selective advantage. Our results highlight the potential of liquid-liquid phase-separated internal microenvironments in guiding the differentiation of functional RNA variants, which could serve as a physical selection pressure in pre-cellular evolution.
Malewicz, K. B.; Robinson, K. E.; Brown, A. M.; Jeffrey, C. S.; Philbin, C. S.; McGlothlin, J. W.; Lemkul, J. A.; Feldman, C. R.
Coevolution proceeds through the evolution of traits that mediate ecological interactions and evolutionary outcomes. In the arms race between toxic Pacific newts (Taricha) and their garter snake predators (Thamnophis), this interface involves tetrodotoxin (TTX), an antipredator defense that inhibits nerve and muscle function by blocking voltage-gated sodium channels. In response, snakes have evolved TTX-resistant channels, in some cases leading to snake populations that are nearly invulnerable to TTX. For decades, newt TTX has been treated as a single defensive trait, yet TTX occurs as a family of structurally related analogs that may represent alternative defenses against snakes. Here, we characterize TTX analog diversity in all four species of Taricha and evaluate how these compounds interact with the sodium channels in coevolved garter snakes. Using LC-MS analysis of newt skin secretions, we detected a diverse suite of TTX analogs previously unrecognized in Pacific newts. We then used molecular docking models to evaluate interactions between various TTX analogs and variants of the skeletal muscle channel (Nav1.4) that span the range of TTX resistance in garter snakes. We found that some TTX analogs docked better than canonical TTX in resistant snake channels. Notably, we show that 11-deoxy-4-epi-TTX and 11-deoxy-TTX have favorable interactions with hydrophobic amino-acid substitutions in extremely resistant garter snake sodium channels, potentially circumventing predator resistance to canonical TTX. Our results suggest a complex arms race involving multiple newt TTX analogs and multiple snake sodium channel variants. As such, newts may keep pace with snakes by diversifying their arsenal of chemical weapons.
Haris Kulosmanovic, H.; Uguz, C.; DURDAGI, S.
Molecular similarity searching is a workhorse of cheminformatics, but the dominant Tanimoto/topological-fingerprint paradigm has well-known blind spots. It is highly sensitive to molecular size, suffers from steep activity cliffs, and frequently fails to retrieve scaffold-hopping bioisosteres. A complementary descriptor that has received comparatively little attention is global elemental composition. Despite the conceptual simplicity of comparing molecules by their elemental ratios, no widely deployed method exists for the statistically rigorous identification of "chemical twins" defined by stoichiometric proximity. We address this gap with TwinSAR (Stoichiometric Analysis and Retrieval), an adaptive kernel-based algorithm that combines three methodological innovations: (i) binary fingerprint blocking that partitions molecules by element-presence patterns and reduces the cost of all-pairs comparison from $O(NM)$ to $O(\sum_i n_i m_i)$, enabling million- to billion-scale searches; (ii) a per-block adaptive radial basis function (RBF) kernel whose precision parameter is calibrated independently for each fingerprint block via the median heuristic, providing fair similarity comparison across chemical sub-spaces of vastly different density; and (iii) a logit-transformed Z-score filter that maps bounded RBF scores onto an unbounded scale, allowing high-similarity pairs to be prioritized relative to the empirical score distribution of their own fingerprint block. TwinSAR is offered in two operating modes: (i) a deterministic BULK mode for exact reproducibility; and (ii) a stochastic FAST mode that achieved a 3.29x wall-clock speed-up in the present benchmark while preserving similar unique-query and unique-target coverage.
Statistical validation showed that detected twin pairs are 12.7x more similar in absolute ratio space than block-matched random pairs (p < 0.001), while a column-permutation negative control returned a median of zero spurious twins across three independent permutations. A controlled benchmark further established that an 8-element representation (single-element heavy-atom ratios) is sensitivity-equivalent to a comprehensive 254-element representation while running 3.55x faster. As a case study, TwinSAR was deployed in an end-to-end virtual screening pipeline against the BCL-2 target protein, where it reduced a 327,071-compound commercial library to a focused panel of 390 candidates. The chemical interpretability of the retrieved twins is illustrated by their structural diversity around conserved heavy-atom skeletons. TwinSAR therefore provides a fast, conformation-free, and statistically principled prefilter that is fully orthogonal to topological fingerprints.
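The three steps named in the abstract (element-presence blocking, a per-block RBF kernel calibrated with the median heuristic, and a logit-transformed Z-score) can be illustrated with a minimal Python sketch. This is not the TwinSAR implementation: the function names, the choice of gamma = 1 / median squared distance, the clipping constant, and the toy molecules are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from statistics import mean, median, pstdev


def element_ratios(counts, elements):
    """Fraction of each element among the counted atoms of one molecule."""
    total = sum(counts.get(e, 0) for e in elements)
    return [counts.get(e, 0) / total for e in elements]


def presence_key(counts, elements):
    """Binary element-presence fingerprint used as the blocking key."""
    return tuple(counts.get(e, 0) > 0 for e in elements)


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def twin_scores(queries, targets, elements, eps=1e-9):
    """Return {(query, target): z} for pairs that share a presence block."""
    blocks = defaultdict(lambda: ([], []))
    for name, counts in queries.items():
        blocks[presence_key(counts, elements)][0].append(
            (name, element_ratios(counts, elements)))
    for name, counts in targets.items():
        blocks[presence_key(counts, elements)][1].append(
            (name, element_ratios(counts, elements)))

    results = {}
    # Only same-block pairs are compared, so the cost is sum_i(n_i * m_i).
    for block_queries, block_targets in blocks.values():
        pairs = [(qn, tn, sq_dist(qv, tv))
                 for qn, qv in block_queries for tn, tv in block_targets]
        if len(pairs) < 2:
            continue  # too few pairs to estimate a block-level distribution
        med = median(d for _, _, d in pairs)
        if med == 0:
            continue
        gamma = 1.0 / med  # assumed form of the median heuristic
        logits = []
        for _, _, d in pairs:
            s = min(max(math.exp(-gamma * d), eps), 1 - eps)  # keep logit finite
            logits.append(math.log(s / (1 - s)))
        mu, sd = mean(logits), pstdev(logits)
        if sd == 0:
            continue
        for (qn, tn, _), logit in zip(pairs, logits):
            results[(qn, tn)] = (logit - mu) / sd
    return results


# Hypothetical toy data: element counts per molecule.
ELEMENTS = ["C", "H", "O"]
QUERIES = {"phenol": {"C": 6, "H": 6, "O": 1}}
TARGETS = {
    "doubled_phenol": {"C": 12, "H": 12, "O": 2},  # same ratios as the query
    "ethanol": {"C": 2, "H": 6, "O": 1},
    "methanol": {"C": 1, "H": 4, "O": 1},
    "benzene": {"C": 6, "H": 6},  # no oxygen, so it falls in another block
}
SCORES = twin_scores(QUERIES, TARGETS, ELEMENTS)
```

In this toy run the target with identical elemental ratios receives the highest Z-score within its block, and benzene is never compared with the query because its presence fingerprint differs.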
Excell, J.; Giardina, A.; Sakamoto-Rablah, E.; Royle, K.; Nunn, D.
Recombinant human lactopontin (rhLPN), an equivalent of human milk lactopontin, is of increasing interest for human nutrition applications due to its roles in mineral binding, gastrointestinal function and immune modulation. These properties depend strongly on post-translational modifications, particularly phosphorylation and glycosylation. Here, we report the production of rhLPN in Kluyveromyces lactis at laboratory and pilot scale and present a comprehensive molecular comparison with native human lactopontin (nhLPN) isolated from human milk. Mass spectrometry-based peptide mapping confirmed the primary structure and identified extensive phosphorylation, consistent with the native protein. Middle-up analyses demonstrated closely matched phosphoform distributions between rhLPN and nhLPN, while glycosylation profiling revealed a defined population of low-complexity O-glycoforms localized to the N-terminus. Functional assessment demonstrated substantially greater iron binding by phosphorylated rhLPN compared with dephosphorylated and non-phosphorylated forms. Similar phosphorylation-dependent behaviour was observed for bovine lactopontin, supporting a conserved role for phosphorylation in mineral interaction. Across five 750 L pilot scale batches, both phosphorylation and glycoform distributions were highly consistent, indicating robust process reproducibility. Together, these results demonstrate that rhLPN produced in K. lactis recapitulates key structural and functional attributes of nhLPN, supporting its suitability as a scalable ingredient for nutrition applications.
Sen, S.; Hoff, S. E.; Morozova, T. I.; Schnapka, V.; Bonomi, M.
Virtual screening has become an indispensable tool in modern structure-based drug discovery, enabling the identification of candidate molecules by computationally evaluating their potential to bind target proteins. The accuracy of such screenings critically depends on the quality of the target structures employed. Recent advances in protein structure prediction, particularly AlphaFold2, have revolutionized this field with unprecedented accuracy. However, AlphaFold2 models often exhibit limitations in local structural details, especially within binding pockets, which limit their utility for small molecule docking. In contrast, molecular dynamics simulations with accurate atomistic force fields can refine protein structures, but lack the ability to leverage the structural information provided by deep learning approaches. Here, we introduce bAIes, an integrative method that bridges this gap by combining physics-based force fields with data-driven predictions through Bayesian inference. Crucially, bAIes demonstrates a superior ability to discriminate between binders and non-binders in virtual screening campaigns, outperforming both AlphaFold2 and molecular dynamics-refined models. By enhancing the usability of AlphaFold2 models without requiring extensive experimental or computational resources, bAIes offers a convenient solution to a longstanding challenge in structure-based drug design, potentially accelerating the early phases of drug discovery.
Bories, S. C. A.; Lague, P.
Membrane association is governed by the thermodynamics of amino acid partitioning between water and the lipid bilayer. Here, we quantified amino acid side-chain insertion energetics in a 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) bilayer using unbiased molecular dynamics simulations. Equilibrium depth distributions of 28 analogs, including multiple protonation states, were converted into potentials of mean force (PMFs) by Boltzmann inversion. The resulting PMFs reproduced the main features of bilayer partitioning. Hydrophobic analogs favored the bilayer core, aromatic analogs were stabilized in interfacial regions, and polar or charged analogs remained unfavorable in the hydrophobic interior. A diglycine analog representing the peptide backbone behaved similarly to uncharged polar residues. Depth-dependent pKa profiles and orientational analyses further showed how protonation equilibria and aromatic-ring alignment influence insertion energetics. Agreement with experimental hydrophobicity scales supports the robustness of the approach. These results provide an efficient and internally consistent framework for characterizing bilayer insertion energetics and establish a reference for future studies in more complex lipid environments.
GRAPHICAL ABSTRACT: Figure 1
SIGNIFICANCE: Membrane-associated proteins represent a large fraction of the proteome and include many major drug targets, yet quantitative understanding of their interactions with lipid bilayers remains limited. Here, we present an unbiased molecular dynamics framework for systematically determining amino acid side-chain insertion free energies in a model bilayer.
By deriving potentials of mean force directly from equilibrium depth distributions, this approach enables internally consistent comparisons across residue classes and protonation states without biasing restraints. The resulting free-energy profiles reproduce established hydrophobicity trends and show how protonation equilibria and aromatic-ring orientation modulate bilayer partitioning. This scalable strategy provides a quantitative reference for residue-level membrane thermodynamics and establishes a foundation for extending insertion energetics to more diverse lipid compositions and more complex membrane-associated systems.
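Boltzmann inversion, the step that turns an equilibrium depth distribution into a PMF, amounts to PMF(z) = -kT ln[P(z) / P(z_ref)]. The sketch below is a generic illustration rather than the authors' analysis code; the function name, the uniform bin width, the choice of reference bin, the temperature, and the example histogram are all assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def pmf_from_histogram(counts, temperature=300.0, ref_bin=0):
    """Boltzmann-invert a depth histogram into a PMF in kcal/mol.

    counts      -- samples per depth bin (uniform bin width assumed)
    ref_bin     -- bin whose free energy is set to zero, e.g. bulk water
    Bins that were never visited get an infinite free energy.
    """
    total = sum(counts)
    probs = [c / total for c in counts]
    p_ref = probs[ref_bin]
    if p_ref <= 0:
        raise ValueError("reference bin must contain at least one sample")
    kt = K_B * temperature
    return [-kt * math.log(p / p_ref) if p > 0 else math.inf for p in probs]


# Hypothetical histogram: bin 0 is bulk water, later bins lie deeper in the bilayer.
EXAMPLE_PMF = pmf_from_histogram([100, 50, 200, 0])
```

A bin visited half as often as the reference lies kT ln 2 higher in free energy, and a bin visited more often than the reference has a negative PMF value. Unvisited bins show why an unbiased approach needs adequate sampling at every depth of interest.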
Khwaja, F. N.; Gunner, J.; Thacker, E.; Abdolhay, Y.; Logan, R.; Kitchen, P.; Veprintsev, D.; Wheatley, M.; Poyner, D.; Ayub, H.
Class B1 G-protein-coupled receptors (GPCRs), such as the calcitonin gene-related peptide (CGRP) receptor and parathyroid hormone 1 (PTH1) receptor, require native lipid interactions to maintain signalling-competent conformations. However, conventional detergents disrupt these environments. Amphipathic copolymers offer a detergent-free alternative, yet the field still lacks a clear understanding of which polymer architectures best preserve active-state GPCR pharmacology, limiting their broader translational utility. Here, we examine how distinct copolymer chemistries influence the functional integrity of class B1 GPCRs by comparing SMA 2000, DIBMA-12, and the electroneutral sulfo-DIBMA. Using NanoLuciferase bioluminescence resonance energy transfer (NanoBRET) ligand-binding, competition, and mini-G-protein recruitment assays on nanodisc-encapsulated receptors, we show that all three copolymers maintain high-affinity extracellular ligand binding but differ markedly in their ability to preserve intracellular signalling. Despite lower receptor extraction efficiency, only sulfo-DIBMA supports mini-Gs engagement at the CGRP receptor and enables G-protein-dependent allosteric modulation at the PTH1 receptor, including conserved ligand affinity and prolonged residence time. These data reveal that polymer charge and backbone chemistry, rather than extraction yield, determine whether native-like nanodiscs retain the conformational landscape required for active-state signalling. Controlling non-specific ligand binding to the copolymer is a key requirement for a successful assay. Our findings identify sulfo-DIBMALP as a particularly favourable environment for preserving native signalling behaviour in class B1 GPCRs, highlighting copolymer chemistry as an important determinant in detergent-free membrane protein studies.
HIGHLIGHTS
- Sulfo-DIBMA encapsulated nanodiscs preserve active-state conformation of human calcitonin gene-related peptide receptor and parathyroid hormone 1 receptor.
- All three copolymers (SMA 2000, DIBMA-12 and sulfo-DIBMA) preserve extracellular ligand binding but only sulfo-DIBMA preserves intracellular functional competence, including mini-Gs recruitment and G-protein-dependent allosteric modulation.
- Copolymer chemistry, particularly the electroneutral, aliphatic nature of sulfo-DIBMA, may influence the preservation of signalling-competent states in two class B1 GPCRs by minimising charge-driven perturbations during solubilisation.
- Sulfo-DIBMALP provides a novel platform for studying dynamic membrane proteins with potential to provide mechanistic insights and facilitate drug discovery programmes in the future.
GRAPHICAL ABSTRACT: Figure 1
Polley, A.; Ravikumar, A.; Shanmugam, S.
Liposomes are self-assembled lipid vesicles capable of encapsulating both hydrophilic and hydrophobic therapeutics, making them versatile platforms in drug delivery and biomedical technology. In this study, the limitations of the classical thin-film hydration method were critically evaluated, and a sustainable, systematically optimized strategy was established for generating defined liposomal lamellar phases. Hydration conditions were optimized, and 4 mL of buffer per 10 mg of lipid was determined to be optimal for effective rehydration and improved statistical reliability of vesicle measurements. A refined probe-sonication protocol (20% amplitude, 5 s ON/55 s OFF pulse) enabled controlled transformation of multivesicular vesicles into stable multilamellar and unilamellar vesicles at net ON-times of 90 s and 185 s, respectively, without overheating or contamination. In addition, a Python-based machine-learning tool was developed for vesicle size characterization. Collectively, these optimizations provided a reproducible and sustainable framework for preparing liposomes across different lamellar phases.
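The pulse arithmetic behind the reported net ON-times can be checked with a short Python sketch. The 5 s ON / 55 s OFF pattern and the 90 s and 185 s net ON-times come from the abstract; the function name and the convention that the final OFF interval is not counted towards elapsed time are assumptions.

```python
def pulse_schedule(net_on_s, on_s=5, off_s=55):
    """Return (number of pulses, elapsed seconds) for a target net ON-time.

    Assumes whole pulses only and that the rest after the last pulse
    is not counted towards the elapsed time.
    """
    if net_on_s <= 0:
        raise ValueError("net ON-time must be positive")
    pulses, remainder = divmod(net_on_s, on_s)
    if remainder:
        raise ValueError("net ON-time must be a multiple of the pulse length")
    elapsed_s = pulses * (on_s + off_s) - off_s
    return pulses, elapsed_s


MULTILAMELLAR = pulse_schedule(90)   # net ON-time reported for multilamellar vesicles
UNILAMELLAR = pulse_schedule(185)    # net ON-time reported for unilamellar vesicles
```

Under these assumptions the two end points correspond to 18 and 37 pulses, so the long rest intervals dominate the bench time.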
Julian, R. K.; Rappold, B. A.; Yi, F.; Master, S. R.
Detection of low-level analytes in complex chromatographic-mass spectrometric data requires a criterion to discern apparent peaks from background. Conventional signal-to-noise criteria rely on simple, constant-variance noise models and overlook spurious peaks generated by chemical noise and co-eluting interferences. We introduce a wavelet-based Monte Carlo technique for determining the statistical significance of SRM LC-MS/MS peaks in the presence of structured chemical noise. The method empirically characterizes chemical-noise peaks in samples and builds a generative noise-only null model. Monte Carlo resampling of the noise model assigns p-values that are controlled for the family-wise type I error rate (FWER). We validated the method with SRMs from a dilution series of drug compounds in plasma with known ground-truth concentrations. Triplicate technical replicates were used, spanning concentrations from far above the limit of detection to far below it. Peaks with adjusted p < 0.05 matched the expectation for true positives above the detection limit. Peaks below the limit of detection matched matrix blanks as true negatives, and intermittent detection in the transition region was observed. An independent external validation using a clinical pain panel confirmed the method detects ketamine in confirmed positive samples with signal intensity below the lowest calibration standard while correctly classifying matrix blanks and biological negatives. As a demonstration, we applied our method to a recently published lipid mediator data set. By replacing subjective noise-region selection with a formal hypothesis test against an empirical null model, the method provides an objective and reproducible criterion for deciding whether peak integration is warranted.
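One common way to obtain p-values that control the family-wise error rate from a generative null model is the single-step "max statistic" approach: each observed peak is compared against the distribution of the largest noise peak per simulated trace. The Python sketch below illustrates that idea only. It is not the authors' method: the noise generator is a hypothetical stand-in for their empirical chemical-noise model, the test statistic is plain peak height rather than a wavelet coefficient, and all names and parameters are assumptions.

```python
import random


def simulate_noise_trace(rng, n_points=200, spike_rate=0.02, spike_mean=3.0):
    """Hypothetical noise-only trace: rectified Gaussian baseline plus rare spikes."""
    trace = []
    for _ in range(n_points):
        y = abs(rng.gauss(0.0, 1.0))
        if rng.random() < spike_rate:
            y += rng.expovariate(1.0 / spike_mean)  # occasional chemical-noise peak
        trace.append(y)
    return trace


def maxt_adjusted_pvalues(observed_heights, n_draws=2000, seed=0):
    """FWER-adjusted Monte Carlo p-values from the null distribution of the maximum.

    For each observed peak height h, p = (1 + #{null maxima >= h}) / (1 + n_draws).
    """
    rng = random.Random(seed)
    null_maxima = [max(simulate_noise_trace(rng)) for _ in range(n_draws)]
    return [
        (1 + sum(m >= h for m in null_maxima)) / (1 + n_draws)
        for h in observed_heights
    ]


# Two hypothetical peaks: one at baseline level, one far above any simulated noise.
P_VALUES = maxt_adjusted_pvalues([1.0, 100.0], n_draws=500, seed=1)
```

Because every candidate peak is compared with the largest noise peak in each simulated trace, the adjusted p-values already account for the number of positions searched. The smallest attainable p-value is 1 / (1 + n_draws), so the number of draws sets the resolution of the test.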
Krishnan, S.; Kambekar, A.; Khandelwal, J.; Pushpavanam, K. S.
Solid-phase peptide synthesis (SPPS) remains the dominant technique for peptide production. However, its reliance on hazardous organic solvents such as N,N-dimethylformamide (DMF) and dichloromethane (DCM) imposes a substantial environmental burden. One potential approach is to replace these organic solvents with water, reducing hazardous solvent consumption and improving the environmental footprint of peptide production. This has led to the emergence of aqueous solid-phase peptide synthesis (ASPPS) approaches. Although successful, these approaches require specialized hydrophilic resins or modified building blocks, limiting their industrial applicability and scalability. Moreover, conventional hydrophobic polystyrene supports remain the most widely used solid supports in industrial SPPS due to their high loading capacity, mechanical robustness, and low cost. These resins are generally considered incompatible with aqueous conditions. Here, we demonstrate that industrially relevant 2-chlorotrityl chloride (CTC) polystyrene resin can support efficient peptide coupling under fully aqueous conditions by integrating a precipitate-free 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDC·HCl) and Oxyma activation system with a synergistic thermal-acoustic strategy. We posit that heating combined with ultrasonic irradiation promotes transient relaxation of the polystyrene matrix and enhances water penetration. This facilitates the diffusion of activated amino acid esters onto the hydrophobic resin required for coupling. The robustness of this aqueous methodology was validated through the synthesis of nine structurally diverse peptide sequences, including aromatic hydrogel-forming peptides, opioid peptides derived from enkephalins, toxin-inspired sequences, and a lipid-interacting fragment of -synuclein. Analytical characterization by HPLC and MALDI-TOF mass spectrometry confirmed successful peptide assembly with high crude purity.
We anticipate that this thermal-acoustic aqueous SPPS strategy provides a scalable and accessible pathway toward sustainable peptide manufacturing on classical hydrophobic supports with aqueous chemistry.
Pailozian, K.; Kohout, P.; Damborsky, J.; Mazurenko, S.
Motivation: Protein melting temperature (Tm) prediction accelerates the discovery of thermostable enzymes, which are crucial for industrial biotechnology processes that often require harsh reaction conditions. Experimental determination of Tm remains labour-intensive and varies across techniques, motivating the development of in silico predictors. Mass-spectrometry datasets such as Meltome Atlas now enable large-scale Tm prediction with models based on deep learning, but model generalisation across diverse experimental datasets has not been systematically tested.
Results: We evaluated the generalisability of state-of-the-art deep learning approaches and explored ESM-based embeddings for Tm prediction. To this end, we assembled the ProMelt training dataset (45 441 proteins) and five independent biophysics-based validation datasets. Our analysis revealed substantial differences between proteomics- and biophysics-based Tm measurements, highlighting the challenge of cross-domain generalisation. Existing state-of-the-art predictors trained on large-scale proteomics datasets showed reduced performance on biophysics-based validation sets. Our fine-tuned embedding-based models, particularly LoRA-adapted ESM-2 (TmProt 1.0), outperformed state-of-the-art predictors in identifying thermostable proteins (Tm ≥ 60 °C) across heterogeneous datasets, achieving AUC scores of 0.75-0.77. We also demonstrated that the available models could be used efficiently in the sequence prioritization task.
Availability: The TmProt web server is available at https://loschmidt.chemi.muni.cz/tmprot/. Source code and data are available at https://github.com/loschmidt/TmProt.
Courtney, K. C.; Valentine, S. J.; Li, P.; Woehrling, A.; Ahmed, S.
Native mass spectrometry (nMS) is a powerful tool for analyzing biomolecules and their complexes under near-native conditions. The preservation of the native state depends strongly on the ionization method used to transfer intact molecules from solution to the gas phase. In this work, capillary vibrating sharp-edge spray ionization (cVSSI)-based nMS and in-droplet hydrogen-deuterium exchange mass spectrometry (HDX-MS) were used to evaluate calcium-dependent interactions between calmodulin and calmidazolium (CDZ). We found that cVSSI produced a narrow charge-state distribution (CSD) with low average charge states, indicating that this method preserved the native-like state. cVSSI was also able to resolve stepwise Ca2+ binding, distinguishing protein species with one to four bound Ca2+ ions. In the absence of Ca2+, no detectable CDZ binding was observed. However, CDZ binding was observed when calmodulin was fully loaded with Ca2+. CDZ binding to the protein caused a marked redistribution of the CSD toward lower charge states, consistent with ligand-induced stabilization of the protein into a more compact conformation. The apparent dissociation constant (Kd) of the interaction was determined to be 261 ± 29 nM and 126 ± 17 nM from Langmuir and quadratic binding models, respectively. Complementary in-droplet HDX-MS showed an approximately 23% reduction in deuterium uptake upon ligand binding, indicating reduced solvent accessibility and increased structural stabilization, in support of the nMS findings. Together, these results demonstrate that cVSSI-based nMS coupled with in-droplet HDX-MS provides an integrated platform for simultaneously resolving metal loading, ligand binding, binding affinity, and ligand-induced conformational changes. This approach complements traditional structural methods by enabling direct interrogation of dynamic, metal-dependent protein-ligand interactions in their native states.
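The two binding models named above differ in how they treat ligand depletion, which is one plausible reason they can return different apparent Kd values from the same titration. The Python sketch below writes out the standard textbook forms of both models for 1:1 binding. It is not the authors' fitting code, and the function names and example concentrations are hypothetical.

```python
import math


def langmuir_fraction(ligand_total, kd):
    """Fraction of protein bound, assuming free ligand equals total ligand."""
    return ligand_total / (kd + ligand_total)


def quadratic_fraction(protein_total, ligand_total, kd):
    """Fraction of protein bound for 1:1 binding with ligand depletion.

    Solves [PL]^2 - (P + L + Kd)[PL] + P*L = 0 for the physical root.
    All concentrations share one unit (nM here).
    """
    b = protein_total + ligand_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


# Hypothetical conditions: 200 nM protein, 200 nM ligand, Kd = 126 nM.
BOUND_LANGMUIR = langmuir_fraction(200.0, 126.0)
BOUND_QUADRATIC = quadratic_fraction(200.0, 200.0, 126.0)
```

When the protein concentration is comparable to Kd, the Langmuir form predicts more binding than the quadratic form for the same Kd, so a Langmuir fit to the same data tends to return a larger apparent Kd. The two forms converge when the protein concentration is far below Kd.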
Graves, S.; Jasinski, M.; Olsen, E.; Kamanzi, A.; Zhang, Y.; Leung, J.; Venier-Karzis, M.; Safaeesirat, A.; Cullis, P.; Leslie, S. R.
The optimization of mRNA-lipid nanoparticles (mRNA-LNPs) for therapeutic applications is limited in part by the inadequate characterization of mRNA payload heterogeneity. One current challenge is accurately measuring the number of mRNA copies within individual LNPs, where the standard method of intensity-based mRNA number determination is sensitive to fluorescent dye-dye interactions and heterogeneity of mRNA labeling. Here we present a single-particle microscopy method that combines direct counting of the mRNA copies per LNP with LNP size measurements. While confined in microwells, individual mRNA-LNPs are lysed to release their cargo and stained with a dye such that the number of mRNA molecules in each well can be directly counted using fluorescence microscopy. Since the method stains the mRNA cargo in situ, it enables characterization of LNPs formulated with therapeutic-grade (e.g., unlabeled) mRNA. We applied this approach to two Onpattro(R)-based LNP formulations prepared using different formulation buffers, where the two formulations had different average mRNA copy number, particle size, and fraction of LNPs lacking mRNA. The ability to directly count the number of mRNA molecules in LNPs establishes a complementary method to intensity-based mRNA number determination and supports the characterization and screening of clinically relevant LNP formulations.
Cheng, C.-Y.; Chen, Y.-A.; Li, F.-Y.; Re, S.
Rapid and accurate prediction of protein-ligand binding is essential for drug discovery. While generative AI has driven rapid advancements in structure-based approaches, sequence-based methods remain significantly faster and more cost-effective. Here, we present a weakly supervised deep learning framework integrating graph convolutional networks (GCN) for molecular encoding and bidirectional long short-term memory (BiLSTM) for protein modeling. The latter represents long-range dependencies better than the widely used convolutional neural network (CNN). Leveraging a bilinear attention network (BAN), this model learns protein-ligand pairwise interactions without requiring three-dimensional structural supervision. Trained solely on affinity labels from the publicly available BindingDB dataset, the model classified binders and non-binders with an AUROC of 0.96 and an AUPRC of 0.95. The model generates interpretable attention maps that serve as a "GPS" to locate binding sites. Remarkably, despite the lack of structural training data, it can pinpoint key contact residues confirmed by crystal structures. Our method could function as a scalable filter for giga-scale libraries, allowing rapid screening of drug candidates with direct structural insights into the protein-ligand interface.
Nikolovski, M.; Wang, T.; Sue, A.; MacRenaris, K.; Zhao, H.; O'Halloran, T.; Hu, J.
The rapid expansion of human genomic data has revealed a large number of naturally occurring variants, creating a major challenge for functional annotation. The human metal transporter SLC39A8 (ZIP8) is a clinically important, promiscuous divalent metal transporter, yet most of its documented variants remain uncharacterized. Here, we developed a workflow to functionally evaluate ZIP8 variants by integrating laser ablation inductively coupled plasma time-of-flight mass spectrometry (LA-ICP-TOF-MS) with scaled-up cell-based transport assays. Using this method, we systematically analyzed 33 naturally occurring missense variants located in the extracellular domain (ECD) of ZIP8. The assay enables direct quantification of intracellular metal accumulation with substantially improved throughput (~150 samples per hour). Functional screening identified 14 potentially pathogenic variants with significantly reduced transport activity. Comparison with computational predictions revealed a moderate correlation between activity and AlphaMissense pathogenicity scores (R² = 0.423), while an error rate of ~20% underscores the need for experimental validation. Flow cytometry analysis showed that most loss-of-function variants exhibit impaired trafficking of the protein to the cell surface, possibly due to mutation-induced protein misfolding or instability. Structural mapping of activity-compromised variants, together with functional assessment of the ZIP8-ECD, highlights the importance of this domain in ZIP8 expression and intracellular trafficking. Together, this work establishes a scalable approach for functional screening of metal transporter variants and provides new insights into the structure-function relationships of ZIP8.
Bellaiche, A.; Choudhary, P.; Nair, S.; Harrus, D.; Yu, C. W.-H.; Tanweer, S. A.; Evans, G. L.; Lo, S. W.; Martin, M.; Fleming, J. R.; Velankar, S.
Structure Integration with Function, Taxonomy and Sequences (SIFTS) provides residue-level mappings between UniProt Knowledgebase sequences and Protein Data Bank structures and has historically been generated through internal Protein Data Bank in Europe (PDBe) pipelines. Here, PDBe-SIFTS is presented as a fully open-source, locally deployable implementation of this mapping framework. The pipeline combines fast, scalable sequence search using MMseqs2, an improved bounded scoring scheme for ranking candidate mappings, and residue-level mapping refinement based on backbone connectivity. PDBe-SIFTS is distributed as a Python package with command-line tools for 1) building a sequence search database, 2) identifying the best sequence-structure match, 3) one-to-one mapping at the residue level, and 4) generating SIFTS annotations in PDBx/mmCIF format. Benchmarking on the complete Protein Data Bank archive showed that MMseqs2 reduced archive-scale UniProtKB searches from hours with BLASTP to minutes, approximately 22-36 times faster, while curated mappings were recovered at top rank in 93.1% of cases. The remaining discrepancies mainly involved biologically ambiguous cases such as highly conserved proteins, chimeric constructs, or closely related orthologs. These results show that PDBe-SIFTS enables fast mapping, improving structural coherence in residue-level alignments while delivering the most up-to-date and accurate mappings, comparable to expert curation. Tool: https://github.com/PDBeurope/SIFTS Quick start notebook with example: https://github.com/PDBeurope/SIFTS/tree/master/notebooks Broader audience statement: Matching protein sequences to their three-dimensional structures, and mapping annotations across both, is essential for understanding protein function, interactions, and molecular mechanisms. This integrated view enables richer interpretation of biological data and underpins advances in drug discovery, disease research, and protein engineering.
PDBe-SIFTS provides an open and functional framework for structure-sequence mapping, allowing researchers and databases to run, inspect, and extend these mappings locally, while benefiting from faster searches, transparent scoring, and structurally informed residue-level alignments.
Qian, Q.; Peng, J.; Ma, D.; Liu, K.; Cheng, Y.; Deng, Y.; Zhao, J.; Su, S.; Yao, Y.; Qu, Y.; Fu, R.; Liu, J.; Zhao, M.; Xiao, Y.; Wang, K.; Wu, Y.; Wang, Y.; Xu, Q.; Wang, J.; Hay, D. C.; Ke, Y.; Wang, Y.; Shipston, M. J.; Chi, Y.
The function of proteins, the building blocks of life, in health and disease depends not only on their 3D conformational states but, most importantly, on the dynamic transitions between states controlled by a wide array of post-translational modifications (PTMs). Recent major advances have been made in our ability to predict static 3D structures; however, understanding and predicting the impact of PTMs on protein conformational dynamics remains a major challenge in the field. Molecular dynamics (MD) simulation remains the major computational approach for studying protein dynamics. However, its high computational cost, the lack of integration of PTMs as conditioning inputs, and the inefficient generation of continuous protein dynamics largely preclude the study of PTM-regulated conformational dynamics and of slow conformational processes. To address this critical bottleneck, we developed ProteinFlux, a flow-matching generative framework that links PTM-conditioned conformational dynamics to evolutionary constraints encoded by PTM sites. Evolutionary information plays a critical role in capturing conformational dynamics beyond sequence identity, and PTM sites inherently encode evolutionary constraints critical to protein functional regulation. We therefore built FluxSite, a dual-modal PTM site predictor that integrates sequence evolutionary information and 3D structural features to generate a continuous conditional signal encoding conservation and functional importance for each predicted site. FluxSite achieves robust generalization across 18 PTM types and 30 disease-associated proteomes. ProteinFlux generates phosphorylation-conditioned, all-atom conformational trajectories across diverse protein fold classes, faithfully reproducing both thermodynamic properties such as free energy landscapes and kinetic features such as conformational transition pathways.
It outperforms state-of-the-art predictors while achieving inference speeds several orders of magnitude faster than traditional MD. In addition, we introduce DynaMo-phos, a benchmark dataset of phosphorylated protein MD simulations. Together, ProteinFlux, FluxSite and DynaMo-phos provide a scalable, high-throughput platform for elucidating PTM-driven conformational mechanisms, with potential applications across allosteric drug design, functional annotation of disease-associated modifications and mechanism-guided therapeutic development.
Znabu, B. F.; Atif, Z.
Native mass spectrometry is a central analytical method for characterizing intact proteins, antibody-drug conjugates, and non-covalent assemblies, and it is increasingly the deciding measurement in biotherapeutic development pipelines. A single screening attempt requires days of expression, purification, and buffer exchange into ammonium acetate, followed by 30 to 60 minutes of optimization on a Q-Exactive UHMR or comparable instrument. To our knowledge, no published sequence-based predictor currently estimates native MS suitability before experimental screening. We curated 634 unique proteins with documented native MS outcomes, drawn from a 232-protein hand-curated base set, 358 entries recovered from RCSB PDB by full-text searching for native MS terminology, and 44 evidence-based extractions from supplementary tables across 80 EuropePMC papers. We trained four model variants on this benchmark: a 36-feature BioPython physicochemical baseline, an ESM-2 linear probe, an ESM-2 PCA-256 random forest, and a combined model that concatenates ESM-2 PCA components with BioPython features. All variants were evaluated under cluster-aware 5-fold cross-validation (GroupKFold over ESM-2 embedding-similarity clusters, our defense against homology leakage) with isotonic calibration; standard stratified 5-fold cross-validation is reported as a sensitivity analysis. Under the cluster-aware splits, the combined model achieved an AUC of 0.869 ± 0.036, essentially unchanged from its original stratified-CV value (0.873), versus 0.852 for the BioPython baseline. The ESM-2-only variants showed AUC drops of 0.024 to 0.046 between stratified and cluster-aware splits, indicating that some of the apparent ESM-2 contribution under standard CV reflects homology leakage.
Negative recall was 9.4 percent under cluster-aware splitting versus 26.0 percent under stratified splitting, confirming that the model's apparent failure-detection capability was substantially inflated by within-fold homology. We report both numbers and treat the cluster-aware values as the primary results. We release the curated dataset, the trained model, and an interactive web tool at nativeready.netlify.app. In its current form, NativeReady should be interpreted primarily as a positive-suitability triage tool; failure prediction remains limited by the scarcity of experimentally documented negative cases. We propose a user-contribution mechanism to accumulate real failure data over time. NativeReady is, to our knowledge, the first open benchmark and triage model specifically designed for this task.
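The idea behind cluster-aware splitting can be sketched in a few lines: every member of a similarity cluster is placed in the same fold, so near-identical proteins never appear on both the training and the test side. The plain-Python sketch below is a simplified stand-in for scikit-learn's `GroupKFold`, not the authors' pipeline; the function name `group_kfold`, the greedy balancing rule, and the toy cluster labels are assumptions made for illustration.

```python
from collections import defaultdict


def group_kfold(groups, n_splits):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    members = defaultdict(list)
    for idx, group in enumerate(groups):
        members[group].append(idx)
    if len(members) < n_splits:
        raise ValueError("need at least n_splits distinct groups")

    # Greedy balancing: place the largest groups first, each into the
    # currently smallest fold (group name breaks ties for determinism).
    folds = [[] for _ in range(n_splits)]
    for group in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_splits), key=lambda k: len(folds[k]))
        folds[smallest].extend(members[group])

    for k in range(n_splits):
        test_idx = sorted(folds[k])
        train_idx = sorted(i for j in range(n_splits) if j != k for i in folds[j])
        yield train_idx, test_idx


# Hypothetical cluster labels for ten proteins (e.g. embedding-similarity clusters).
toy_clusters = ["c1", "c1", "c2", "c3", "c3", "c3", "c4", "c5", "c5", "c6"]
splits = list(group_kfold(toy_clusters, n_splits=3))
```

A stratified split, by contrast, balances class labels across folds but is free to separate members of one cluster, which is the leakage route the abstract describes.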
Cens Holste, S.; Dos Santos, L.; Charan, M. R.; Nyhegn-Eriksen, O.; Crouigneau, R.; Kragelund, B. B.; Marie, R.; Sandelin, A.; Auxillos, J. Y.; Pedersen, S. F.
Extracellular pH (pHe) is a key microenvironmental factor shaping cell physiology and disease, creating a need for quantitative biosensors that can capture dynamic changes in pHe at the surface of individual living cells. Here, we develop a genetically encoded, ratiometric extracellular pH biosensor through systematic screening of a modular library of membrane-display designs that combine SEpHluorin with a pH-stable reference fluorophore. Screening identified a cell-surface-localised mKate2-SEpHluorin construct, named SurpHer, that exhibits dynamic ratiometric responses across the pHe range of 6 to 7.8. SurpHer shows robust membrane localisation and extracellular pH responsiveness across diverse human cell types, including HEK293T, PANC-1 and MDA-MB-231 cells. Following stable integration in MDA-MB-231 cells, SurpHer enabled time-course imaging of pHe gradients in a microfluidic platform for modelling tumour microenvironments. SurpHer enables real-time interrogation of the pericellular pH environment of tumour cells and, more broadly, provides a strategy to probe microenvironmental pH dynamics across diverse biological contexts.
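A ratiometric readout of this kind is typically converted to pH through a calibration curve. The sketch below assumes a single-site sigmoidal titration model for the ratio of pH-sensitive to reference fluorescence and inverts it to recover pH. The model form, the function names, and the calibration constants (`pka`, `r_min`, `r_max`) are hypothetical placeholders, not values or methods reported for SurpHer.

```python
import math


def ratio_from_ph(ph, pka, r_min, r_max):
    """Assumed calibration curve: the ratio rises sigmoidally from r_min to r_max."""
    return r_min + (r_max - r_min) / (1.0 + 10.0 ** (pka - ph))


def ph_from_ratio(ratio, pka, r_min, r_max):
    """Invert the calibration curve; valid only strictly between r_min and r_max."""
    if not (r_min < ratio < r_max):
        raise ValueError("ratio lies outside the calibrated range")
    return pka + math.log10((ratio - r_min) / (r_max - ratio))


# Hypothetical calibration constants, chosen only for illustration.
PKA, R_MIN, R_MAX = 7.1, 0.2, 2.0

measured_ratio = ratio_from_ph(6.8, PKA, R_MIN, R_MAX)
recovered_ph = ph_from_ratio(measured_ratio, PKA, R_MIN, R_MAX)
```

Because the curve flattens towards `r_min` and `r_max`, small ratio errors translate into large pH errors near the ends of the range, which is one reason a sensor's usable window is quoted as a bounded interval.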